Abstract. In this paper we present a new approach to the problem of
approximating a function from a training set of I/O points using fuzzy
logic and fuzzy systems. As we will see, this approach offers a number of
advantages over other, more limited systems. Among these advantages, we
highlight the considerable reduction in the number of rules needed to
model the underlying function of the data set and, from another point of
view, the possibility of bringing interpretation to the rules of the
resulting system through the Taylor Series concept. This work is
reinforced by an algorithm able to obtain the pseudo-optimal polynomial
consequents of the rules. Finally, the performance of our approach and
that of the associated algorithm are shown through a significant example.



1 Introduction 

The Function Approximation problem deals with the estimation of an unknown
model from a data set of continuous input/output points; the objective is to
obtain a model that yields the expected output for any new input.

Fuzzy Logic, on the other hand, is one of the three roots of soft computing;
it has been successfully applied to several areas of science and engineering
thanks to its many benefits. One key to the paradigm is the simplicity and
understandability of the model, which nonetheless encapsulates complex
relations among variables. The other main characteristic is that the model
can be interpreted, for example through linguistic values that bring meaning
to the variables involved in the problem.

Many authors have dealt with Fuzzy Logic and Fuzzy Systems for function
approximation from an input/output data set, using clustering techniques as
well as grid techniques, obtaining in general good enough results.
Specifically, the TSK model [7] is better suited to this kind of problem due
to its computational capability.

Fig. 1. a) MF distribution used for this example. Target function:
y = (x - 10)^2. b) Original function + model output + linear submodels for
each of the three rules using a TSK model of order 1. The global output of
the TSK fuzzy system is visually indistinguishable from the actual output,
but no interpretation can be given to the three linear sub-models.



The fuzzy inference method proposed by Takagi, Sugeno and Kang, known as the
TSK model in the fuzzy systems field, has been one of the major topics in
both theoretical and practical research on fuzzy modelling and control. The
basic idea is to subdivide the input space into fuzzy regions and to
approximate the system in each region by a simple model.

The main advantage of the TSK model is its representative power: it is
capable of describing a highly complex nonlinear system using a small number
of simple rules. In spite of this, TSK systems suffer from a lack of
interpretability, which should be one of the main advantages of fuzzy systems
in general. While the overall behaviour of the whole TSK fuzzy system gives
an idea of what it does, the sub-models given by each rule in the TSK fuzzy
system provide no interpretable information by themselves [T]. See Fig. 1.

Therefore, this lack of interpretability may discourage researchers from
using TSK models in problems where the interpretability of the obtained model
and of its sub-models is a key concern.

Apart from the interpretability issue, the number of rules needed for a
working model is also a key concern. For control problems, grid-based fuzzy
systems are preferable since they cover the whole input space (all the
possible operating regions in which the plant to be controlled can be found
during its operation). Nevertheless, Mamdani fuzzy systems, and even TSK
fuzzy systems of order 0 or 1, usually need an excessive number of rules for
a moderate number of input variables, even though they reach pseudo-optimal
solutions.

In this paper we propose the use of high-order TSK rules in a grid-based
fuzzy system, reducing the number of rules while keeping the advantages of
the grid-based approach for control and function approximation. Also, to keep
the interpretability of the model obtained, we present a small modification
of the consequents of the high-order TSK rules that provides interpretability
for each of the sub-models (rules) that compose the global system.

The rest of the paper is organized as follows: Section 2 presents high-order
TSK rules together with an algorithm to obtain the optimal coefficients of
the rule consequents. Section 3 provides an introduction to the Taylor Series
Expansion, the concept that provides the key to the interpretability issue
discussed in Section 4. Finally, Section 5 presents a complete example that
demonstrates the suitability and quality of our approach.

2 High-Order TSK Fuzzy Rules 

The fuzzy inference system proposed by Takagi, Sugeno and Kang, known as the 
TSK model in the fuzzy system literature, provides a powerful tool for modelling 
complex nonlinear systems. Typically, a TSK model consists of IF-THEN rules 
that have the form: 

$$R^k: \text{IF } x_1 \text{ is } A_1^k \text{ AND } \ldots \text{ AND } x_n \text{ is } A_n^k \text{ THEN } y = a_0^k + a_1^k x_1 + \ldots + a_n^k x_n \qquad (1)$$

where the $A_i^k$ are fuzzy sets characterized by membership functions
$A_i^k(x_i)$, the $a_i^k$ are real-valued parameters and the $x_i$ are the
input variables.

A Sugeno approximator comprises a set of TSK fuzzy rules that maps any
input vector $x = [x_1, x_2, \ldots, x_n]$ to its desired output
$y \in \mathbb{R}$. The output of the Sugeno approximator for any input
vector $x$ is calculated as follows:

$$F(x) = \frac{\sum_{k=1}^{K} \mu_k(x)\, y_k}{\sum_{k=1}^{K} \mu_k(x)} \qquad (2)$$

where $\mu_k(x)$ is the activation value of the antecedent of rule $k$, which
can be expressed as:

$$\mu_k(x) = A_1^k(x_1)\, A_2^k(x_2) \cdots A_n^k(x_n) \qquad (3)$$
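
As a minimal sketch of equations (2) and (3), the following Python function
computes the output of a Sugeno approximator for one input vector. The rule
layout (a list of per-input membership callables paired with a consequent
callable) and the name `tsk_output` are assumptions made here for
illustration; they are not part of the original formulation.

```python
from math import prod


def tsk_output(x, rules):
    """Evaluate a TSK system at input vector x, following (2) and (3).

    rules is a list of (memberships, consequent) pairs (hypothetical layout):
    memberships holds one callable per input variable, and consequent maps
    the whole input vector to the rule output y_k.
    """
    num = 0.0
    den = 0.0
    for memberships, consequent in rules:
        # (3): the rule activation is the product of per-input memberships.
        mu = prod(A(xi) for A, xi in zip(memberships, x))
        num += mu * consequent(x)
        den += mu
    if den == 0.0:
        raise ValueError("no rule is activated at this input")
    # (2): activation-weighted average of the rule outputs.
    return num / den


# Two rules on one input with complementary memberships on [0, 1].
rules = [
    ([lambda v: 1.0 - v], lambda x: 0.0),   # "x is low"  -> y = 0
    ([lambda v: v], lambda x: 10.0),        # "x is high" -> y = 10
]
result = tsk_output([0.3], rules)           # (0.7 * 0 + 0.3 * 10) / 1.0
```

With these two toy rules the output at x = 0.3 is the weighted average of 0
and 10 with weights 0.7 and 0.3, that is, 3.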

The main advantage of the TSK model is its representative power; it is
capable of describing a highly nonlinear system using a small number of
rules. Moreover, since the output of the model has an explicit functional
form, its parameters can conveniently be identified with learning algorithms.
These characteristics make the TSK model very suitable for the problem of
function approximation, and many authors have successfully applied TSK
systems to it. For example, many well-known neuro-fuzzy systems such as
ANFIS [I] have been constructed on the basis of the TSK model.

Nevertheless, very few authors have dealt with high-order TSK fuzzy systems.
Buckley [5] generalized the original Sugeno inference engine by changing the
form of the consequent to a general polynomial, that is:

$$R^k: \text{IF } x_1 \text{ is } A_1^k \text{ AND } \ldots \text{ AND } x_n \text{ is } A_n^k \text{ THEN } y = Y_k(x) \qquad (4)$$

where $Y_k(x)$ is a polynomial of any order. Taking order 2, it can be
expressed as

$$Y_k(x) = w_0^k + (w^k)^T x + \frac{1}{2}\, x^T W^k x \qquad (5)$$

where $w_0^k$ is a scalar, $w^k$ is a column vector of coefficients with
dimension $n$ (one per input variable) and $W^k$ is a triangular matrix of
dimensions $n \times n$ ($W_{ij}^k$ being the coefficient of the quadratic
factor $x_i x_j$, for $i, j = 1, \ldots, n$).
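
The second-order consequent can be evaluated directly from its three groups
of coefficients. The sketch below assumes the form
$w_0 + w^T x + \frac{1}{2} x^T W x$ with an upper-triangular $W$; the
function name and the numerical coefficients are hypothetical and serve only
as an illustration.

```python
import numpy as np


def quadratic_consequent(x, w0, w, W):
    """Evaluate Y_k(x) = w0 + w^T x + 0.5 * x^T W x for one rule."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    W = np.asarray(W, dtype=float)
    return float(w0 + w @ x + 0.5 * x @ W @ x)


# Hypothetical coefficients for a rule with n = 2 inputs.
w0 = 1.0
w = [2.0, 3.0]
W = [[4.0, 1.0],
     [0.0, 2.0]]   # upper-triangular: one entry per product x_i * x_j
y = quadratic_consequent([1.0, 2.0], w0, w, W)   # 1 + 8 + 0.5 * 14 = 16.0
```

A rule with n inputs therefore carries 1 + n + n(n + 1)/2 free coefficients,
which is the price paid for needing fewer rules.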



Now that we have defined how a TSK fuzzy system can be adapted to work
with high-order rules, let us see how, given a set of input/output data D
and a configuration of membership functions for the input variables, to adapt
the consequents of the rules so that the TSK model output optimally fits the
data set D. The Least Square Error (LSE) algorithm will be used for that
purpose. LSE tries to minimize the error function:

$$J = \sum_{m \in D} \left( y_m - F(x_m) \right)^2 \qquad (6)$$



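where $F$ is the output of the TSK fuzzy system as in (2). Setting to zero
the first derivative (7, 8) of $J$ with respect to each single parameter
($w_0$ and each component of $w$ and $W$) gives a system of linear equations
from which to obtain the optimal values of the parameters:

$$\frac{\partial J}{\partial w_{ki}} = -2 \sum_{m \in D} \left( y_m - F(x_m) \right) \frac{\mu_k(x_m)\, f_{w_{ki}}(x_m)}{\sum_{j=1}^{K} \mu_j(x_m)} = 0 \qquad (7)$$

Since $F$ is linear in the coefficients, substituting (2) into (7) gives one
linear equation per coefficient:

$$\sum_{m \in D} \sum_{s=1}^{K} \sum_{l} w_{sl}\, \frac{\mu_k(x_m)\, f_{w_{ki}}(x_m)\; \mu_s(x_m)\, f_{w_{sl}}(x_m)}{\left( \sum_{j=1}^{K} \mu_j(x_m) \right)^2} = \sum_{m \in D} y_m\, \frac{\mu_k(x_m)\, f_{w_{ki}}(x_m)}{\sum_{j=1}^{K} \mu_j(x_m)} \qquad (8)$$

where $w_{ki}$ (rule $k$, coefficient $i$) is the coefficient with respect to
which we are differentiating in each case ($w_0$ or any component of $w$ or
$W$), the inner sum runs over all the coefficients $l$ of rule $s$, and
$f_{w_{ki}}$ is the partial derivative of the consequent of rule $k$ with
respect to $w_{ki}$, i.e., 1 for the 0-order coefficient $w_0$, $x_i$ for
every first-order coefficient $w_i$, or $x_p x_j$ (scaled by the factor 1/2
of (5)) for every second-order coefficient $w_{pj}$ of $W$.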

Once we have the system of linear equations, it only remains to obtain the
optimal solution for all the coefficients of every rule. The Orthogonal
Least-Squares (OLS) method [B] guarantees a single optimal solution,
obtaining the values of the significant coefficients while discarding the
rest. We thereby avoid the problems caused by redundancy in the activation
matrix.
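
The least-squares step can be sketched for a single input variable as
follows. This is only an illustration under stated assumptions: it minimizes
(6) with `numpy.linalg.lstsq` (an SVD-based solver) rather than with the OLS
method used in this paper, it does not discard any coefficient, and it uses a
quintic smoothstep membership basis with equally spaced centres that we chose
for convenience. The helper names are hypothetical.

```python
import numpy as np


def smoothstep(t):
    """Quintic step from 0 to 1 (assumed shape for the membership functions)."""
    t = np.clip(t, 0.0, 1.0)
    return t**3 * (10.0 - 15.0 * t + 6.0 * t**2)


def activations(x, centres):
    """Membership of every sample in every rule, shape (samples, rules)."""
    width = centres[1] - centres[0]
    return 1.0 - smoothstep(np.abs(x[:, None] - centres[None, :]) / width)


def fit_consequents(x, y, centres):
    """Least-squares estimate of (w0, w, W) for every rule, one input."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    centres = np.asarray(centres, dtype=float)
    mu = activations(x, centres)
    mu = mu / mu.sum(axis=1, keepdims=True)        # normalisation in (2)
    terms = np.stack([np.ones_like(x), x, 0.5 * x**2], axis=1)
    # One column per (rule, coefficient): normalised activation times term.
    design = (mu[:, :, None] * terms[:, None, :]).reshape(len(x), -1)
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs.reshape(len(centres), 3), design


x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 2.0 * x - 3.0 * x**2           # toy target, exactly representable
coeffs, design = fit_consequents(x, y, [0.0, 0.5, 1.0])
max_error = np.max(np.abs(design @ coeffs.ravel() - y))
```

Each column of `design` is the factor that multiplies one coefficient in
$F(x_m)$, so minimizing the squared residual of `design @ coeffs` against `y`
is the same problem as solving the linear system obtained from (7).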

Having reviewed the type of rules that we are going to operate with, let us
now return to the "lack of interpretability curse" that TSK Fuzzy Systems
suffer from, as we saw in Section 1. Since polynomials are not easily
interpretable as rule consequents, we now give the key to the
interpretability of our Taylor Series-based rules.



3 Taylor Series-Based Fuzzy Rules (TSFR) 

Let $f(x)$ be a function defined in an interval with an intermediate point
$a$, at which we know the derivatives of all orders. The first-order
polynomial

$$p_1(x) = f(a) + f'(a)(x - a) \qquad (9)$$

has the same value as $f(x)$ at the point $x = a$ and also the same
first-order derivative at this point. Its graph is the tangent line to the
graph of $f(x)$ at the point $x = a$.

Taking also the second derivative of $f(x)$ at $x = a$, we can build the
second-order polynomial

$$p_2(x) = f(a) + f'(a)(x - a) + \frac{1}{2} f''(a)(x - a)^2 \qquad (10)$$

which has the same value as $f(x)$ at the point $x = a$, and also the same
values for the first and second derivatives. The graph of this polynomial
will be closer to that of $f(x)$ at the points in the vicinity of $x = a$. We
can therefore expect that if we build a polynomial of $n$th order with the
first $n$ derivatives of $f(x)$ at $x = a$, that polynomial will get very
close to $f(x)$ in the neighbourhood of $x = a$.

Taylor's theorem states that if a function $f(x)$ defined in an interval has
derivatives of all orders, it can be approximated near a point $x = a$ by its
Taylor Series Expansion around that point:

$$f(x) = f(a) + f'(a)(x - a) + \frac{1}{2} f''(a)(x - a)^2 + \ldots + \frac{1}{n!} f^{(n)}(a)(x - a)^n + \frac{1}{(n+1)!} f^{(n+1)}(c)(x - a)^{n+1} \qquad (11)$$

where in each case, $c$ is a point between $x$ and $a$.
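
A small numerical check makes the effect of the order visible. The sketch
below evaluates truncated Taylor polynomials from a list of derivative values
and compares orders 1 and 2 for $f(x) = e^x$ around $a = 0$, a function we
chose only because all its derivatives at 0 equal 1; the helper name is
hypothetical.

```python
from math import exp, factorial


def taylor_polynomial(derivatives, a, x):
    """Evaluate sum_i f^(i)(a) / i! * (x - a)^i.

    derivatives = [f(a), f'(a), f''(a), ...]; its length sets the order.
    """
    return sum(d / factorial(i) * (x - a) ** i
               for i, d in enumerate(derivatives))


a, x = 0.0, 0.1
p1 = taylor_polynomial([1.0, 1.0], a, x)        # 1 + x
p2 = taylor_polynomial([1.0, 1.0, 1.0], a, x)   # 1 + x + x^2 / 2
error1 = abs(exp(x) - p1)
error2 = abs(exp(x) - p2)
```

At this distance from the expansion point the second-order error is more than
an order of magnitude below the first-order one, in line with the remainder
term of (11).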

For $n$-dimensional purposes, the formula is adapted in the following form:

$$f(x) = f(a) + (x - a)^T \left[ \frac{\partial f}{\partial x_i}(a) \right]_{i=1 \ldots n} + \frac{1}{2} (x - a)^T W (x - a) + \frac{1}{3!} W_3(x - a, x - a, x - a) + \ldots \qquad (12)$$

where $W$ is a triangular matrix of dimensions $n \times n$, and $W_s$ is a
triangular multi-linear form in $s$ vector arguments $v^1, \ldots, v^s$.

Taylor series open a door to the approximation of any function through
polynomials, that is, through the addition of a number of simple functions.
They are therefore a fundamental tool in Function Approximation Theory and
Mathematical Analysis. The Taylor Series Expansion will also give us a way
to bring interpretation to TSK fuzzy systems by taking a certain type of rule
consequents and antecedents, as we will now see.

As noted in [3], we will use input variables in the antecedents with
membership functions that form an Orderly Local Membership Function Basis
(OLMF). The requirements that a set of membership functions for a variable
must fulfil to be an OLMF are basically:

— Every membership function extreme point must coincide with the centre of 
the adjacent membership function. 

— The n-th derivative of the membership function is continuous in its whole 
interval of definition. 

— The n-th derivative of the membership function vanishes at the centre and 
at the boundaries. 
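
The three requirements above can be illustrated with a concrete family of
membership functions. The sketch below uses a quintic smoothstep shape with
equally spaced centres; this particular shape is our assumption and is not
necessarily the basis used in [3]. Its first and second derivatives are
continuous and vanish at the centre and at the boundaries of each membership
function, and each membership function ends exactly at the centres of its
neighbours.

```python
import numpy as np


def smoothstep(t):
    """Quintic step: 0 -> 1 with zero first and second derivatives at both ends."""
    t = np.clip(t, 0.0, 1.0)
    return t**3 * (10.0 - 15.0 * t + 6.0 * t**2)


def memberships(x, centres):
    """Membership degrees, shape (len(x), len(centres)), equally spaced centres."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    centres = np.asarray(centres, dtype=float)
    width = centres[1] - centres[0]
    return 1.0 - smoothstep(np.abs(x[:, None] - centres[None, :]) / width)


centres = np.linspace(0.0, 1.0, 5)            # 0, 0.25, 0.5, 0.75, 1
x = np.linspace(0.0, 1.0, 201)
mu = memberships(x, centres)
sums = mu.sum(axis=1)                         # sum of memberships per sample
at_centres = memberships(centres, centres)    # one active function per centre
```

For this basis the memberships add up to one at every sample, and exactly one
membership function is fully active at each centre.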

The main advantage of using this kind of membership functions is the
differentiability of the output of the TSK fuzzy system. This is not possible
with triangular or trapezoidal membership functions: their derivative does
not exist at the centres of the membership functions, so the fuzzy system
output is not differentiable.

These OLMF bases also have the addition-to-unity property: the activations
of all the rules add up to one at any point inside the input domain in a TSK
fuzzy system that keeps the OLMF basis restrictions. Therefore the output of
the TSK fuzzy system can be expressed as:

$$F(x) = \sum_{k=1}^{K} \mu_k(x)\, Y_k(x) \qquad (13)$$

Then the OLS method cited in Section 2 will work well for the given system,
and can identify the optimal coefficients without needing another execution
of the algorithm as noticed in [§].

Finally, given that the input variables have a distribution of membership
functions that form an OLMF basis, we will use high-order TSK rules in the
form (4), but where the polynomial consequents are in the form:

$$Y_k(x) = w_0^k + (w^k)^T (x - a^k) + \frac{1}{2} (x - a^k)^T W^k (x - a^k) \qquad (14)$$

$a^k$ being the centre of rule $k$, therefore forming Taylor Series
Expansions around the centres of the rules.

4 Interpretability Issues

It can be demonstrated [3] that given a Sugeno approximator F(x) such that:

— 1) the membership functions of the input variables form an OLMF basis of
order m (the m-th derivative being continuous everywhere);

514 



L.J. Herrera et al. 



— 2) the consequent side is written in the rule-centred form shown in (4) and
(14), and the polynomials $Y_k(x)$ are of degree $n$.

Then for $n < m$, every $Y_k(x)$ can be interpreted as a truncated Taylor
series expansion of order $n$ of $F(x)$ about the point $x = a^k$, the centre
of the $k$th rule.

Supposing therefore that we have a method to obtain the optimal consequent
coefficients of Taylor Series-based TSK rules for function approximation,
given a data set and a membership function distribution that forms an OLMF
basis, we can then interpret the rule consequents $Y_k(x)$ as the truncated
Taylor series expansions of the system output around the centres of the
rules. This system also provides a pseudo-optimal approximation to the
objective function. In the limit case where the function is perfectly
approximated by our system, the rule consequents will coincide with the
Taylor Series expansions of that function about the centre of each rule,
having reached total interpretability and total approximation.
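
This property can be checked numerically for one input variable. The sketch
below builds $F(x)$ from rule-centred quadratic consequents and verifies that
the value and the slope of $F$ at each rule centre equal the zeroth- and
first-order coefficients of that rule. The quintic smoothstep membership
basis and the coefficient values are assumptions of ours, chosen only for
illustration, and the slope is estimated by central differences.

```python
import numpy as np


def smoothstep(t):
    """Quintic step with zero first and second derivatives at both ends."""
    t = np.clip(t, 0.0, 1.0)
    return t**3 * (10.0 - 15.0 * t + 6.0 * t**2)


def model_output(x, centres, coeffs):
    """F(x) as in (2), with rule-centred quadratic consequents, one input."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    width = centres[1] - centres[0]
    d = x[:, None] - centres[None, :]             # x - a^k for every rule
    mu = 1.0 - smoothstep(np.abs(d) / width)
    Y = coeffs[:, 0] + coeffs[:, 1] * d + 0.5 * coeffs[:, 2] * d**2
    return (mu * Y).sum(axis=1) / mu.sum(axis=1)


centres = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
# Arbitrary (w0, w, W) per rule, chosen only for illustration.
coeffs = np.array([[0.0, 5.0, -50.0],
                   [0.3, -1.5, -4.0],
                   [0.0, -0.5, 8.0],
                   [0.0, 0.2, -2.0],
                   [0.0, 0.0, 0.0]])

h = 1e-5
value = model_output(centres, centres, coeffs)
slope = (model_output(centres + h, centres, coeffs)
         - model_output(centres - h, centres, coeffs)) / (2.0 * h)
```

Whatever coefficients are chosen, `value` reproduces the first column of
`coeffs` and `slope` the second one, because the neighbouring membership
functions and their first derivatives vanish at each centre.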

In [3], Marwan Bikdash directly used the (available) Taylor Series Expansion
of the function around the rule centres, for each rule, to approximate the
function with the TSK fuzzy system. Notice that these rule consequents,
though strongly interpretable, are not the optimal consequents in the
least-squares sense. Note also that the Taylor Series Expansion approximates
a function only in the vicinity of the reference point. Therefore, even using
a high number of MFs, the error obtained by the method in [3] is seldom small
enough (compared to a system of similar complexity with consequents optimized
using LSE), and the system output therefore hardly represents a good
approximation of the data we are modelling.

In this paper we also suppose that the only information we have from the 
function to approximate are the input/output points in the initial dataset. No 
information is given of the derivatives of the function w.r.t. any point. Also, 
there is no accurate way to obtain the derivatives from the training points to 
perform the approximation as the method in [3] required. 

5 Simulations 

Consider a set of 100 randomly chosen I/O data from the 1-D function [2]:

$$F(x) = e^{-5x} \sin(2\pi x), \quad x \in [0, 1] \qquad (15)$$

Let us now try to model those data using a fuzzy system with 5 membership
functions for the single input variable $x$, forming an OLMF basis, and rule
consequents of the form given by (14), $Y_k$ being an order-2 polynomial.

The five rules obtained after the execution of the LSE algorithm using OLS
are the following:

$$\begin{aligned}
&\text{IF } x \text{ is } A_1 \text{ THEN } y = -26.0860x^2 + 5.3247x + 0.0116 \\
&\text{IF } x \text{ is } A_2 \text{ THEN } y = -1.6235(x - 0.25) + 0.2882 \\
&\text{IF } x \text{ is } A_3 \text{ THEN } y = 4.3006(x - 0.5)^2 - 0.5193(x - 0.5) \\
&\text{IF } x \text{ is } A_4 \text{ THEN } y = -1.1066(x - 0.75)^2 + 0.1780(x - 0.75) - 0.0238 \\
&\text{IF } x \text{ is } A_5 \text{ THEN } y = 0
\end{aligned} \qquad (16)$$

Fig. 2. a) Original function (solid line) and Taylor Series Expansion based
Fuzzy System (dotted line). With only 5 membership functions, the output of
the system is very similar to the original function. NRMSE = 0.0154.
b) Original function + model output + second membership function consequent
(centred at x=0.25). c) Original function + model output + third membership
function consequent (centred at x=0.5). d) Original function + model output
+ fourth membership function consequent (centred at x=0.75). We see clearly
how these polynomials come close to the Taylor Series Expansion of the fuzzy
system output around the centres of the rules.



The interpretability comes from the fact that, at the points near each
centre of the five rules, the function is extremely similar to the polynomial
output of the rules, as shown in Figure 2. These polynomials are kept
expressed as Taylor Series Expansions of the function at the points in the
vicinity of the centres of the rules. The system is therefore fully
interpretable and also brings some further advantages, as noted below.
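
For reference, the analytic second-order Taylor coefficients of the target
function (15) at the five rule centres can be computed as sketched below. The
derivative formulas were obtained by hand from (15), and the function names
are ours. The property stated in Section 4 concerns the expansion of the
model output rather than of the target, so only approximate agreement with
the fitted consequents in (16) should be expected.

```python
from math import cos, exp, pi, sin


def f(x):
    """Target function (15)."""
    return exp(-5.0 * x) * sin(2.0 * pi * x)


def f1(x):
    """First derivative of f."""
    return exp(-5.0 * x) * (2.0 * pi * cos(2.0 * pi * x)
                            - 5.0 * sin(2.0 * pi * x))


def f2(x):
    """Second derivative of f."""
    return exp(-5.0 * x) * ((25.0 - 4.0 * pi**2) * sin(2.0 * pi * x)
                            - 20.0 * pi * cos(2.0 * pi * x))


def taylor_coefficients(a):
    """(c0, c1, c2) such that f(x) ~ c0 + c1 (x - a) + c2 (x - a)^2 near a."""
    return f(a), f1(a), 0.5 * f2(a)


centres = [0.0, 0.25, 0.5, 0.75, 1.0]
table = {a: taylor_coefficients(a) for a in centres}
```

At the centre x = 0.25, for instance, the analytic values are
f = e^{-1.25}, approximately 0.2865, and f' = -5e^{-1.25}, approximately
-1.4325, against the fitted 0.2882 and -1.6235 of the second rule in (16).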

Figure 2 also shows clearly that the LSE finds the optimal consequent
coefficients for the given input/output data set. It must also be noted that
with only five rules (one per membership function), the error obtained is
notably low. If we compare our system with a TSK fuzzy system with constant
consequents and the same number of rules, we see that the error of the latter
(NRMSE = 0.3493) is very high compared to that of our Taylor Series Expansion
based fuzzy system (NRMSE = 0.0154).

It should be remembered that the Normalized Root-Mean Square Error
(NRMSE) is defined as:

$$\mathrm{NRMSE} = \sqrt{\frac{e^2}{\sigma^2}}$$

where $\sigma^2$ is the variance of the output data, and $e^2$ is the
mean-square error between the system and the dataset D output.
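
The definition translates into a few lines of code. The sketch below assumes
the population variance (NumPy's default, `ddof=0`), since the text does not
state which estimator is used; the function name is ours.

```python
import numpy as np


def nrmse(y_true, y_pred):
    """Square root of the mean-square error over the variance of the outputs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_true - y_pred) ** 2)
    return float(np.sqrt(mse / np.var(y_true)))


y = np.array([1.0, 2.0, 3.0, 4.0])
baseline = nrmse(y, np.full_like(y, y.mean()))   # predicting the mean gives 1.0
perfect = nrmse(y, y)                            # exact predictions give 0.0
```

An NRMSE of 1 therefore corresponds to a model no better than the mean of the
data, which puts the values reported in this section in context.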

Comparing also with the same number of parameters, that is, 5 rules for our
system against 15 rules for a TSK system with constant consequents, we
observe that the error obtained by our Taylor-based rule system
(NRMSE = 0.0154) is much lower than that of the constant-consequent rule
system (NRMSE = 0.0635).

6 Conclusions 

In this paper we have presented an approach to the problem of function
approximation from a set of I/O points using a special type of fuzzy system.
Using an Orderly Local Membership Function Basis (OLMF) and Taylor
Series-Based Fuzzy Rules, the proposed fuzzy system has the property that the
Taylor Series Expansion of the defuzzified function around each rule centre
coincides with that rule's consequent. This endows the proposed system with
both the approximating capabilities of TSK fuzzy rules, through the use of
the OLS algorithm, and the interpretability advantages of pure fuzzy systems.